A New Algorithm for Cluster Initialization
Abstract
Clustering is a well-known technique in data mining. One of the most widely used clustering techniques is the k-means algorithm. Solutions obtained with this technique depend on the initialization of the cluster centers. In this article we propose a new algorithm to initialize the clusters. The proposed algorithm is based on finding a set of medians extracted from the dimension with maximum variance. The algorithm has been applied to different data sets, and good results were obtained.
Keywords: clustering, k-means, data mining.
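The abstract gives only the outline of the method (medians taken along the dimension with maximum variance), so the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes the data are sorted along the highest-variance dimension, split into k nearly equal contiguous groups, and that the point at the median position of each group becomes an initial center. The function name `median_init` and the equal-size split are hypothetical choices made for illustration.

```python
import numpy as np


def median_init(X, k):
    """Sketch: choose k initial centers from medians along the max-variance dimension.

    Assumption (not stated in the abstract): points are sorted along that
    dimension and split into k nearly equal contiguous groups.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim != 2:
        raise ValueError("X must be a 2-D array of shape (n_points, n_features)")
    if not 1 <= k <= len(X):
        raise ValueError("k must be between 1 and the number of points")

    # Index of the feature (column) with the largest variance.
    dim = int(np.argmax(X.var(axis=0)))

    # Point indices ordered by their value along that feature.
    order = np.argsort(X[:, dim], kind="stable")

    # k contiguous groups of nearly equal size (hypothetical splitting rule).
    groups = np.array_split(order, k)

    # The point at the middle position of each group (upper median for
    # even-sized groups) serves as that group's initial center.
    return np.array([X[g[len(g) // 2]] for g in groups])


if __name__ == "__main__":
    data = np.array([[0, 0], [1, 10], [2, 20], [0, 30], [1, 40], [2, 50]])
    print(median_init(data, 2))
```

Centers produced this way are deterministic for a given data set, so repeated runs start from the same point. If scikit-learn is available, an array of this shape can be passed as the `init` argument of `KMeans` with `n_init=1`.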
Similar resources
DEVELOPING A NEW INITIALIZATION PROCEDURE FOR DISTILLATION COLUMN SIMULATION
The simulation of distillation columns is an essential step in design, optimization, and rating. In this paper, a new procedure has been proposed for the initial estimation of column profiles based on modified Kremser’s group method for simple and/or complex columns. The effect of this initialization algorithm on simulation procedure has been studied through two examples. The results show sig...
A new algorithm for choosing initial cluster centers for k-means
The k-means algorithm is widely used in many applications due to its simplicity and fast speed. However, its result is very sensitive to the initialization step: choosing initial cluster centers. Different initialization algorithms may lead to different clustering results and may also affect the convergence of the method. In this paper, we propose a new algorithm for improving the initializatio...
Cluster center initialization algorithm for K-modes clustering
Partitional clustering of categorical data is normally performed by using K-modes clustering algorithm, which works well for large datasets. Even though the design and implementation of K-modes algorithm is simple and efficient, it has the pitfall of randomly choosing the initial cluster centers for invoking every new execution that may lead to non-repeatable clustering results. This paper addr...
A Hybrid Data Clustering Algorithm Using Modified Krill Herd Algorithm and K-MEANS
Data clustering is the process of partitioning a set of data objects into meaningful clusters or groups. Because clustering algorithms are used in so many fields, a great deal of research is still devoted to finding the best and most efficient clustering algorithm. K-means is simple and easy to implement, but it is sensitive to the initialization of the cluster centers and can therefore become trapped in a local optimum. In this paper...
An improved opposition-based Crow Search Algorithm for Data Clustering
Data clustering is an ideal way of working with a huge amount of data and looking for structure in the dataset. In other words, clustering groups similar data together: the similarity among the data within a cluster is maximal, and the similarity among the data in different clusters is minimal. The innovation of this paper is a clustering method based on the Crow Search Algorithm (CS...